Data-driven modeling of collaboration networks: A cross-domain analysis
We analyze large-scale data sets about collaborations from two different
domains: economics, specifically 22,000 R&D alliances between 14,500 firms, and
science, specifically 300,000 co-authorship relations between 95,000
scientists. Considering the different domains of the data sets, we address two
questions: (a) to what extent do the collaboration networks reconstructed from
the data share common structural features, and (b) can their structure be
reproduced by the same agent-based model. In our data-driven modeling approach
we use aggregated network data to calibrate the probabilities at which agents
establish collaborations with either newcomers or established agents. The model
is then validated by its ability to reproduce network features not used for
calibration, including distributions of degrees, path lengths, local clustering
coefficients and sizes of disconnected components. Emphasis is put on comparing
domains, but also sub-domains (economic sectors, scientific specializations).
Interpreting the link probabilities as strategies for link formation, we find
that in R&D collaborations newcomers prefer links with established agents,
while in co-authorship relations newcomers prefer links with other newcomers.
Our results shed new light on the long-standing question about the role of
endogenous and exogenous factors (i.e., different information available to the
initiator of a collaboration) in network formation. Comment: 25 pages, 13 figures, 4 tables
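The link-formation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function name `simulate_collaborations`, its parameter names, the seed network of two agents, and the uniform choice among established agents are all assumptions made for illustration. In each step an initiator (a newcomer or an established agent) links either to an established agent or to a newcomer, with a probability that depends on the initiator's type.

```python
import random

def simulate_collaborations(steps, p_new_initiator, p_newcomer_to_established,
                            p_established_to_established, seed=0):
    """Toy agent-based link formation (hypothetical; not the published model).

    Returns a list of (initiator, partner) collaborations. Repeated
    collaborations between the same pair are allowed.
    """
    rng = random.Random(seed)
    edges = [(0, 1)]        # assumed seed network: one existing collaboration
    established = [0, 1]    # agents that already have at least one link
    next_id = 2
    for _ in range(steps):
        # Decide whether the initiator is a newcomer or an established agent.
        initiator_is_new = rng.random() < p_new_initiator
        if initiator_is_new:
            initiator = next_id
            next_id += 1
        else:
            initiator = rng.choice(established)
        # Probability of choosing an established partner depends on initiator type.
        p_established = (p_newcomer_to_established if initiator_is_new
                         else p_established_to_established)
        if rng.random() < p_established:
            partner = rng.choice([a for a in established if a != initiator])
            partner_is_new = False
        else:
            partner = next_id
            next_id += 1
            partner_is_new = True
        edges.append((initiator, partner))
        if initiator_is_new:
            established.append(initiator)
        if partner_is_new:
            established.append(partner)
    return edges
```

In the approach described in the abstract, such probabilities are calibrated from aggregated network data; in this sketch they are free inputs, so the output only shows how different settings change the resulting network.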
Quantifying knowledge exchange in R&D networks: A data-driven model
We propose a model that reflects two important processes in R&D activities of
firms, the formation of R&D alliances and the exchange of knowledge as a result
of these collaborations. In a data-driven approach, we analyze two large-scale
data sets extracting unique information about 7500 R&D alliances and 5200
patent portfolios of firms. This data is used to calibrate the model parameters
for network formation and knowledge exchange. We obtain probabilities for
incumbent and newcomer firms to link to other incumbents or newcomers which are
able to reproduce the topology of the empirical R&D network. The position of
firms in a knowledge space is obtained from their patents using two different
classification schemes, IPC in 8 dimensions and ISI-OST-INPI in 35 dimensions.
Our dynamics of knowledge exchange assumes that collaborating firms approach
each other in knowledge space at a given rate for a given alliance duration.
Both parameters are obtained in two different ways, by comparing knowledge
distances from simulations and empirics and by analyzing the collaboration
efficiency. This is a new measure that also takes into
account the effort of firms to maintain concurrent alliances, and is evaluated
via extensive computer simulations. We find that R&D alliances have a duration
of around two years and that the subsequent knowledge exchange occurs at a very
low rate. Hence, a firm's position in the knowledge space is a
determinant rather than a consequence of its R&D alliances. From our data-driven
approach we also find model configurations that can be both realistic and
optimized with respect to the collaboration efficiency.
Effective policies, as suggested by our model, would incentivize shorter R&D
alliances and higher knowledge exchange rates. Comment: 35 pages, 10 figures
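The knowledge-exchange dynamics summarized in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the linear discrete-time update rule, the function name `exchange_knowledge`, and the parameter names `rate` and `duration` are hypothetical. Two allied firms, represented as points in a knowledge space, move toward each other by a fraction `rate` of their separation in each of `duration` steps.

```python
import math

def exchange_knowledge(x_i, x_j, rate, duration):
    """Move two firms toward each other in knowledge space (toy update rule).

    x_i, x_j: lists of coordinates with equal length.
    rate: fraction of the separation covered by each firm per step (assumed).
    duration: number of time steps the alliance lasts (assumed discrete).
    """
    for _ in range(duration):
        new_i = [a + rate * (b - a) for a, b in zip(x_i, x_j)]
        new_j = [b + rate * (a - b) for a, b in zip(x_i, x_j)]
        x_i, x_j = new_i, new_j
    return x_i, x_j

# Usage: two firms at distance 1 in a two-dimensional toy knowledge space.
firm_a, firm_b = exchange_knowledge([0.0, 0.0], [1.0, 0.0], rate=0.1, duration=2)
final_distance = math.dist(firm_a, firm_b)
```

Under this rule the distance shrinks by a factor of (1 - 2 * rate) per step, so a low rate leaves the firms' positions nearly unchanged over a short alliance, which is consistent with the abstract's qualitative conclusion.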
Quantifying and suppressing ranking bias in a large citation network
It is widely recognized that citation counts for papers from different fields cannot be directly compared because different scientific fields adopt different citation practices. Citation counts are also strongly biased by paper age since older papers had more time to attract citations. Various procedures aim at suppressing these biases and give rise to new normalized indicators, such as the relative citation count. We use a large citation dataset from Microsoft Academic Graph and a new statistical framework based on the Mahalanobis distance to show that the rankings by well-known indicators, including the relative citation count and Google's PageRank score, are significantly biased by paper field and age. Our statistical framework to assess ranking bias allows us to exactly quantify the contributions of each individual field to the overall bias of a given ranking. We propose a general normalization procedure motivated by the z-score which produces much less biased rankings when applied to citation count and PageRank score.
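A z-score-style normalization of the kind mentioned in this abstract can be sketched as below. The grouping of papers by exact (field, year) pairs, the function name `zscore_by_group`, and the record layout are assumptions for illustration; the paper's actual procedure may define comparison groups differently.

```python
from collections import defaultdict
from statistics import mean, pstdev

def zscore_by_group(papers):
    """Rescale each paper's score relative to papers of the same field and year.

    papers: list of dicts with keys 'id', 'field', 'year', 'score'
            (a hypothetical record layout).
    Returns a dict mapping paper id to its z-score within its group.
    """
    groups = defaultdict(list)
    for p in papers:
        groups[(p["field"], p["year"])].append(p["score"])
    normalized = {}
    for p in papers:
        values = groups[(p["field"], p["year"])]
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation of the group
        # Groups without variation carry no ranking information; map to 0.
        normalized[p["id"]] = (p["score"] - mu) / sigma if sigma > 0 else 0.0
    return normalized
```

With this rescaling, a paper that is top of a low-citation field and a paper that is top of a high-citation field can receive the same normalized score, which is the intent of suppressing field and age bias.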
Reconstructing signed relations from interaction data
Positive and negative relations play an essential role in human behavior and
shape the communities we live in. Despite their importance, data about signed
relations is rare and commonly gathered through surveys. Interaction data is
more abundant, for instance, in the form of proximity or communication data. So
far, though, it could not be utilized to detect signed relations. In this
paper, we show how the underlying signed relations can be extracted with such
data. Employing a statistical network approach, we construct networks of signed
relations in four communities. We then show that these relations correspond to
the ones reported in surveys. Additionally, the inferred relations allow us to
study the homophily of individuals with respect to gender, religious beliefs,
and financial backgrounds. We evaluate the importance of triads in the signed
network to study group cohesion. Comment: 14 pages, 3 figures, submitted
Adapting to Disruptions: Flexibility as a Pillar of Supply Chain Resilience
Supply chain disruptions cause shortages of raw material and products. To
increase resilience, i.e., the ability to cope with shocks, substituting goods
in established supply chains can become an effective alternative to creating
new distribution links. We demonstrate its impact on supply deficits through a
detailed analysis of the US opioid distribution system. Reconstructing 40
billion empirical distribution paths, our data-driven model allows a unique
inspection of policies that increase the substitution flexibility. Our approach
enables policymakers to quantify the trade-off between increasing flexibility,
i.e., reduced supply deficits, and increasing complexity of the supply chain,
which could make it more expensive to operate.
Modeling social resilience: Questions, answers, open problems
Resilience denotes the capacity of a system to withstand shocks and its
ability to recover from them. We develop a framework to quantify the resilience
of highly volatile, non-equilibrium social organizations, such as collectives
or collaborating teams. It consists of four steps: (i) \emph{delimitation},
i.e., narrowing down the target systems, (ii) \emph{conceptualization}, i.e.,
identifying how to approach social organizations, (iii) formal
\emph{representation} using a combination of agent-based and network models,
(iv) \emph{operationalization}, i.e., specifying measures and demonstrating how
they enter the calculation of resilience. Our framework quantifies two
dimensions of resilience, the \emph{robustness} of social organizations and
their \emph{adaptivity}, and combines them in a novel resilience measure. It
allows monitoring resilience instantaneously using longitudinal data instead of
an ex-post evaluation.
The structure, exchange, and transfer of knowledge in socio-technical systems
This thesis aims to improve our understanding of the role of knowledge in economics and science. We analyze collaboration activities in these two domains, and show how the interactions among firms and among scientists influence the structure and the exchange of knowledge. We also model how the knowledge of these actors defines their collaborations. We show that knowledge is not only a consequence, but also a determinant of collaborations. To capture this interplay, we combine a statistical analysis of patent and publication data with agent-based models of collaboration activities.
We follow a data-driven approach to study the structure, exchange, and transfer of knowledge.
Specifically, using publication data we proxy the structure of scientific knowledge by reconstructing the citation network between publications. On this network, we quantitatively show that citation patterns strongly differ across time and scientific fields. We also identify the different knowledge of scientists, and quantify their knowledge exchange occurring during collaborations. Similarly, we use patent data to identify firms' knowledge and the knowledge exchange between firms involved in R&D alliances. Then, to study the transfer of knowledge, we reconstruct scientists' career paths by tracing their affiliations reported on their publications. With these paths, we construct the global migration network of scientists at city level, and analyze its topological properties.
After analyzing collaboration activities, the exchange, and the transfer of knowledge, we reproduce these using agent-based models that we calibrate and validate against real-world data. In order to capture the very different processes behind these phenomena, we develop three different models.
Specifically, to model collaboration activities among firms and their subsequent knowledge exchange, we combine and extend two existing models that captured only one of these phenomena each.
Our new model, instead, is able to simultaneously reproduce both these phenomena.
To show how the knowledge differences between scientists determine their collaboration activities,
we develop a second model that takes as input only these differences. Then, to model the transfer of knowledge, we develop a third agent-based model that reproduces scientists' migration at city level and the observed topological properties of the global migration network.
Finally, we show that citation patterns between journals and scientists' career paths are better modeled by a new mathematical framework defined by higher-order networks than by traditional network models. By this, we challenge the application of the traditional network perspective to model the flow of knowledge between journals and the transfer of knowledge across research institutes.
When standard network measures fail to rank journals: A theoretical and empirical analysis
Journal rankings are widely used and are often based on citation data in combination with a network approach. We argue that some of these network-based rankings can produce misleading results. From a theoretical point of view, we show that the standard network modeling approach of citation data at the journal level (i.e., the projection of paper citations onto journals) introduces fictitious relations among journals. To overcome this problem, we propose a citation path approach, and empirically show that rankings based on the network and the citation path approach are very different. Specifically, we use MEDLINE, the largest open-access bibliometric data set, listing 24,135 journals, 26,759,399 papers, and 323,356,788 citations. We focus on PageRank, an established and well-known network metric. Based on our theoretical and empirical analysis, we highlight the limitations of standard network metrics and propose a method to overcome them. ISSN: 2641-333
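The claim that projecting paper citations onto journals introduces fictitious relations can be illustrated with a toy example. The four papers, three journals, and helper name `reachable` below are hypothetical, and the sketch only demonstrates the projection problem; it does not implement the authors' citation path approach.

```python
def reachable(edges, start):
    """Return all nodes reachable from `start` by following directed edges."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

# Hypothetical data: paper a1 (journal A) cites b1 (journal B),
# and a different paper b2 (journal B) cites c1 (journal C).
journal_of = {"a1": "A", "b1": "B", "b2": "B", "c1": "C"}
paper_citations = [("a1", "b1"), ("b2", "c1")]

# Standard projection: one journal-level edge per paper-level citation.
journal_edges = {(journal_of[u], journal_of[v]) for u, v in paper_citations}

# Journals reachable from A in the projected journal network.
journals_via_projection = reachable(journal_edges, "A")

# Journals actually reached by citation paths starting at a paper in A.
journals_via_papers = {journal_of[p] for p in reachable(paper_citations, "a1")}
```

In the projected network, journal C appears reachable from journal A through B, although no chain of paper citations connects a paper in A to a paper in C. Network metrics that follow paths, such as PageRank, would count that fictitious path.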
Reproducing scientists’ mobility: a data-driven model
High-skill labour is an important factor underpinning the competitive advantage of modern economies. Therefore, attracting and retaining scientists has become a major concern for migration policy. In this work, we study the migration of scientists on a global scale, by combining two large data sets covering the publications of 3.5 million scientists over 60 years. We analyse the geographical distances they moved for a new affiliation and their age when moving, thereby reconstructing their geographical “career paths”. These paths are used to derive the world network of scientists’ mobility between cities and to analyse its topological properties. We further develop and calibrate an agent-based model, such that it reproduces the empirical findings both at the level of scientists and of the global network. Our model takes into account that the academic hiring process is largely demand-driven and demonstrates that the probability that scientists relocate decreases both with age and with distance. Our results allow interpreting the model assumptions as micro-based decision rules that can explain the observed mobility patterns of scientists. ISSN: 2045-232